Analysis of Source Code Repositories
Abstract
Source code repositories are designed to store large amounts of source code. They also indirectly collect information that is useful for analyzing the development process. This second set of data usually goes unused because specialized tools to collect and analyze it are lacking. This paper presents the early stages of a tool designed to acquire and analyze the data stored in source code repositories.
Similar resources
Topic modeling of public repositories at scale using names in source code
Programming languages themselves have a limited number of reserved keywords and character-based tokens that define the language specification. However, programmers make rich use of natural language within their code through comments, text literals and naming entities. The programmer-defined names that can be found in source code are a rich source of information to build a high level understan...
Source Code Reuse Analysis in Multiple Projects based on the Clone Genealogy
In the software industry and OSS projects, it is said that source code reuse could improve the productivity and reliability of software development, and reduce development time. On the other hand, source code reuse requires professional skills of developers. Ad-hoc reuse might introduce some maintenance problems. The source code reuse analysis for software development organizations is worthy to be ...
Towards an Analysis of Who Creates Clone and Who Reuses it
Code clone analysis is valuable because it can efficiently reveal reuse behaviors from software repositories. Recently, some code reuse analyses using clone genealogies and code clones over multiple projects were conducted. However, most of the conventional analyses do not consider individual differences among developers in reuse behaviors. In this paper, we propose a method for code reuse analy...
Mining the Categorized Software Repositories to Improve the Analysis of Security Vulnerabilities
Security has become the Achilles’ heel of most modern software systems. Techniques ranging from manual inspection to automated static and dynamic analyses are commonly employed to identify security vulnerabilities prior to the release of the software. However, these techniques are time-consuming and cannot keep up with the complexity of ever-growing software repositories (e.g., Google Play ...
Public Git Archive: a Big Code dataset for all
The number of open source software projects has been growing exponentially. The major online software repository host, GitHub, has accumulated tens of millions of publicly available Git version-controlled repositories. Although the research potential enabled by the available open source code is clearly substantial, no significant large-scale open source code datasets exist. In this paper, we pre...
Publication date: 2003